Mac arm fixes #1794

Merged: Bike merged 9 commits into main from mac-arm-fixes, Jun 12, 2026

Bike (Member) commented Jun 12, 2026

This incorporates most of @dg1sbg's remaining PRs (#1768, #1771, #1786, #1788) relating to Mac:

  • JIT W^X handling in several places
  • Adding the Homebrew include directory to Koga for ARM Macs. Why this directory is different on ARM, I'm not sure, but it is.
  • quoting various pathname arguments to shell commands so that pathnames with spaces in them work
  • removing the snapshot encoding of the first byte of relocatable code, since the relocation process can change it

dg1sbg and others added 9 commits June 2, 2026 15:52
On Apple Silicon, snapshot_load fixes up the loaded code's data in place in
MAP_JIT (W^X) memory; the bare stores fault with SIGBUS (KERN_PROTECTION_FAILURE).
Wrap the three load-side write sites in JITDataReadWriteMaybeExecute() /
JITDataReadExecute() (same pattern as the other JIT-literal write sites):
  - the code-literals memcpy,
  - the fixup_objects walk (walk_temporary_root_objects<fixup_objects_t>),
  - the fixup_internals walk.
(The save path fixes up a RW copy of the buffer, so it never hit this.)

This unblocks snapshot load past the W^X SIGBUS. The remaining failure -- the
position-independent function-pointer relocation decode (decodeEntryPoint /
fixedAddress) -- is separate, pre-existing sc598 relocation work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ath quoting

Make snapshot save/load/executable work on macOS arm64 (SLAD-SNAPSHOT and
SLAD-EXECUTABLE now pass):

1. Relocation decode: the entry-point decode validated the resolved address by
   comparing its first byte against a saved firstByte and ABORTED on mismatch
   (decodeEntryPointForCompiledCode) / warned (fixedAddress). But a function's first
   instruction is frequently a relocatable instruction (e.g. ADRP) whose encoded
   immediate bytes legitimately differ between the save-time and load-time JIT
   (different load addresses => different page offsets). The offset/symbol-based decode
   is correct (verified: snapshot loads and returns the right value); drop the
   firstByte equality check, which was a false positive on every relocated first insn.

2. Executable link: save-lisp-and-die :executable builds a clang++ command run via
   system() (a shell); the -Wl,-force_load,<libdir>/libiclasp.a path was unquoted, so
   a build dir containing a space (e.g. a Dropbox path) split the arg and the link
   failed. Quote the output, sectcreate, and force_load paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
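
A minimal sketch of why the firstByte check in item 1 above misfires, assuming the standard AArch64 ADRP encoding and a little-endian target. The function names and the addresses are hypothetical and are not taken from clasp; the point is only that the same ADRP, placed at two different addresses, encodes a different page delta and so starts with a different byte.

```cpp
#include <cstdint>

// Encode AArch64 `ADRP Xd, target` as it would appear at address `pc`.
// Field layout: op=1 | immlo(2) | 10000 | immhi(19) | Rd(5),
// where imm = (page of target) - (page of pc), in 4 KiB pages.
constexpr std::uint32_t encode_adrp(unsigned rd, std::uint64_t pc, std::uint64_t target) {
  const std::int64_t page_delta =
      static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12);
  const std::uint32_t imm = static_cast<std::uint32_t>(page_delta) & 0x1FFFFFu; // 21-bit field
  const std::uint32_t immlo = imm & 0x3u;
  const std::uint32_t immhi = imm >> 2;
  return 0x90000000u | (immlo << 29) | (immhi << 5) | (rd & 0x1Fu);
}

// On a little-endian target the first byte in memory is the low 8 bits,
// which hold Rd and the low three bits of immhi.
constexpr std::uint8_t first_byte(std::uint32_t insn) {
  return static_cast<std::uint8_t>(insn & 0xFFu);
}

// Hypothetical addresses: the same ADRP placed at two different code addresses.
static_assert(encode_adrp(0, 0x100000000ULL, 0x100008000ULL) == 0x90000040u,
              "page delta of 8");
static_assert(encode_adrp(0, 0x100004000ULL, 0x100008000ULL) == 0x90000020u,
              "page delta of 4");
static_assert(first_byte(0x90000040u) != first_byte(0x90000020u),
              "first bytes differ although the instruction is the same ADRP");
```

Because Rd and the low bits of immhi share the first byte, a byte-equality check can fail even when the offset/symbol-based decode has resolved the correct function.
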
…e -L

The executable link command (run via system()) carried only the RELATIVE
-Lboehmprecise/lib from BUILD_LINKFLAGS, so -lclasp resolved only when
save-lisp-and-die :executable ran with CWD=build/; from any other directory the
link failed ("library not found for -lclasp"). Add an absolute, quoted
-L"<_LibDir>" (as the Linux branch already does). The runtime rpath is already
absolute, so the produced executable both links and runs from any CWD.

Verified: created and ran a standalone executable from /tmp (returns the right
value); SLAD-EXECUTABLE now passes from the repo root, not just build/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
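
The path handling in the two link-command commits above can be sketched as a small helper, assuming the command string is handed to a POSIX shell through system(). `shell_quote` and `lib_dir_flag` are hypothetical names, not clasp functions, and the sketch uses single quotes, whereas the commit message shows the path wrapped in double quotes.

```cpp
#include <string>

// Wrap `arg` in single quotes so a POSIX shell treats it as one word.
// An embedded single quote is written as '\'' (close, escaped quote, reopen).
std::string shell_quote(const std::string& arg) {
  std::string out = "'";
  for (char c : arg) {
    if (c == '\'') {
      out += "'\\''";
    } else {
      out += c;
    }
  }
  out += "'";
  return out;
}

// Hypothetical helper: build a -L flag from an already-absolute library
// directory, so the link does not depend on the current working directory.
std::string lib_dir_flag(const std::string& abs_lib_dir) {
  return "-L" + shell_quote(abs_lib_dir);
}
```

With a build directory such as `/Users/me/Dropbox (Personal)/clasp/build/lib` (an invented path), `lib_dir_flag` yields one shell word instead of a flag that splits at the space.
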
Two problems prevented `ninja -C build` from working on an Apple Silicon
Mac with a Homebrew toolchain:

* boost (used by clbind/config.h) ships no pkg-config file and is therefore
  not declared as a koga `library`; clasp relies on it being on the default
  include path. units.lisp only added the Intel Homebrew prefix
  `/usr/local/include`, so `/opt/homebrew/include` was never searched and the
  build failed immediately with "'boost/config.hpp' file not found". Add
  `/opt/homebrew/include` for darwin when it exists (probe-guarded, so Intel
  and Linux are unaffected).

* The rpath was emitted as an unquoted `-Wl,-rpath,<abs path>`. When the build
  directory contains a space, ninja passes the flag through /bin/sh which then
  splits it at the space, and the link fails with
  "no such file or directory: '<tail of path>'". Quote the path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Apple Silicon, the LLVM ORC JITLink memory slab is mapped MAP_JIT and is
write-protected (execute mode) per-thread by default; a thread must call
pthread_jit_write_protect_np(false) before writing to it. clasp writes Lisp
object pointers into each native module's literals vector, which lives in that
JIT memory. Those writes were not bracketed by a switch to write mode, so on
Apple Silicon they fault with SIGBUS (EXC_BAD_ACCESS code=2,
KERN_PROTECTION_FAILURE) -- which manifested as a crash in
loadltv::attr_clasp_module_native while loading freshly compiled native FASLs
during the "Compiling Clasp native image" bootstrap step. (On x86-64 and Linux
the rwx page is genuinely writable, so the bug was latent there.)

The helpers JITDataReadWriteMaybeExecute()/JITDataReadExecute() already exist
for exactly this but had no callers. Bracket every write into JIT-resident
literals memory with them:

* core::core__literals_vset            (compiler.cc)
* llvmo::code_literal_set              (code.cc)
* loadltv::attr_clasp_module_native    (loadltv.cc)
* loadltv::attr_clasp_function_native_estranged (loadltv.cc)
* snapshot-load literal relocation memcpy (snapshotSaveLoad.cc)

Reads of the literals vector are left untouched: MAP_JIT memory is readable in
execute mode, only writes fault. The write-mode window is kept as small as
possible (the bare store) so no JIT code is executed while the thread is in
write mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
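
The bracketing pattern described above, sketched as an RAII guard. `ScopedJitWrite` and `store_literal` are hypothetical names; clasp's own helpers are JITDataReadWriteMaybeExecute() and JITDataReadExecute(), whose bodies are not shown here. The sketch assumes pthread_jit_write_protect_np() from <pthread.h> on Apple Silicon and compiles to a no-op elsewhere.

```cpp
#if defined(__APPLE__) && defined(__aarch64__)
#include <pthread.h>
#endif

// Hypothetical guard: switches the calling thread's MAP_JIT view to write
// mode on construction and back to execute mode on destruction.
class ScopedJitWrite {
public:
  ScopedJitWrite() { set_write_protect(false); }
  ~ScopedJitWrite() { set_write_protect(true); }
  ScopedJitWrite(const ScopedJitWrite&) = delete;
  ScopedJitWrite& operator=(const ScopedJitWrite&) = delete;

private:
  static void set_write_protect(bool enabled) {
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(enabled ? 1 : 0);
#else
    (void)enabled; // other platforms: the page is writable already, nothing to do
#endif
  }
};

// Store one pointer into a literals slot with the write window limited to
// the bare store, so no JIT code runs while the thread is in write mode.
void store_literal(void** slot, void* value) {
  ScopedJitWrite guard;
  *slot = value;
}
```

Scoping the guard to the single store follows the commit's stated aim of keeping the write-mode window as small as possible.
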
On Apple Silicon, JIT'd code/data lives in MAP_JIT memory that is
write-protected (execute mode) by default; a thread must switch it to
write mode (pthread_jit_write_protect_np) around any store or the store
faults with SIGBUS (KERN_PROTECTION_FAILURE).

setf_jit_lookup_t (llvmo:jit-lookup-t setf) stores a Lisp function
pointer into a JIT-emitted global. make-callback / clasp-ffi:%defcallback
reach it via (setf (llvm-sys:jit-lookup-t dylib varname) function), and
on arm64-darwin it SIGBUSes at compile time -- this is what makes the
defcallback-native regression test (CFFI-DEFCALLBACK) crash during
compile-file.

Wrap the store in JITDataReadWriteMaybeExecute()/JITDataReadExecute(),
matching the existing W^X guards on the other JIT-literal write sites
(core__literals_vset, loadltv op_setf_literals / attr_clasp_module_native).

Verified on macOS arm64 (native boehmprecise image): CFFI-DEFCALLBACK now
passes (previously a Bus error at compile time); no regressions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The only thing we were using the first byte for is a correctness check. If that
check is not valid due to relocation, there's no point in putting
the byte in.
Bike merged commit a139427 into main Jun 12, 2026
4 of 6 checks passed
Bike deleted the mac-arm-fixes branch June 12, 2026 22:44